Understanding Deep Convolutional Networks
Deep convolutional networks provide state-of-the-art classification and
regression results on many high-dimensional problems. We review their
architecture, which scatters data with a cascade of linear filter weights and
non-linearities. A mathematical framework is introduced to analyze their
properties. Computations of invariants involve multiscale contractions, the
linearization of hierarchical symmetries, and sparse separations. Applications
are discussed.
Comment: 17 pages, 4 figures
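The cascade described above, linear filtering followed by a non-linearity at each stage, can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: the Morlet-like filters, their frequencies, and the two-stage depth are all illustrative choices.

```python
import numpy as np

def morlet(n, xi, sigma):
    # Complex Morlet-like band-pass filter of length n centered at frequency xi.
    t = np.arange(-n // 2, n // 2)
    return np.exp(1j * xi * t) * np.exp(-t**2 / (2 * sigma**2))

def scatter_layer(x, filters):
    # One cascade stage: linear filtering followed by the modulus non-linearity.
    return [np.abs(np.convolve(x, h, mode="same")) for h in filters]

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
filters = [morlet(64, xi, 8.0) for xi in (0.4, 0.8, 1.6)]

u1 = scatter_layer(x, filters)                 # first-order envelopes
s1 = np.array([u.mean() for u in u1])          # averaging yields invariant coefficients
u2 = [scatter_layer(u, filters) for u in u1]   # cascade a second stage
s2 = np.array([[v.mean() for v in row] for row in u2])
```

Averaging the envelopes produces locally translation-invariant coefficients, while the second stage recovers information lost to the first averaging.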
Wavelet Scattering on the Pitch Spiral
We present a new representation of harmonic sounds that linearizes the
dynamics of pitch and spectral envelope, while remaining stable to deformations
in the time-frequency plane. It is an instance of the scattering transform, a
generic operator which cascades wavelet convolutions and modulus
nonlinearities. It is derived from the pitch spiral, in that convolutions are
successively performed in time, log-frequency, and octave index. We give a
closed-form approximation of spiral scattering coefficients for a nonstationary
generalization of the harmonic source-filter model.
Comment: Proceedings of the 18th International Conference on Digital Audio
Effects (DAFx-15), Trondheim, Norway, Nov 30 - Dec 3, 2015, pp. 429--432. 4
pages, 3 figures
Audio Texture Synthesis with Scattering Moments
We introduce an audio texture synthesis algorithm based on scattering
moments. A scattering transform is computed by iteratively decomposing a signal
with complex wavelet filter banks and computing their amplitude envelopes.
Scattering moments provide general representations of stationary processes
computed as expected values of scattering coefficients. They are estimated with
low variance estimators from single realizations. Audio signals having
prescribed scattering moments are synthesized with a gradient descent
algorithm. Audio synthesis examples show that scattering representations
provide good synthesis of audio textures with far fewer coefficients than the
state of the art.
Comment: 5 pages, 2 figures
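The claim that scattering moments are estimated with low variance from single realizations can be checked with a toy numpy experiment: first-order moments computed from two independent realizations of the same stationary process concentrate around the same values. The filters and process below are illustrative stand-ins, not the paper's setup.

```python
import numpy as np

def morlet(n, xi, sigma):
    # Complex Morlet-like band-pass filter centered at frequency xi.
    t = np.arange(-n // 2, n // 2)
    return np.exp(1j * xi * t) * np.exp(-t**2 / (2 * sigma**2))

def scattering_moments(x, filters):
    # First-order scattering moments: expected amplitude envelope per band,
    # estimated here by a time average over a single realization.
    return np.array([np.abs(np.convolve(x, h, mode="same")).mean() for h in filters])

filters = [morlet(64, xi, 8.0) for xi in (0.3, 0.6, 1.2)]
rng = np.random.default_rng(0)

# Two independent realizations of the same stationary process (white noise):
m1 = scattering_moments(rng.standard_normal(16384), filters)
m2 = scattering_moments(rng.standard_normal(16384), filters)
rel_diff = np.abs(m1 - m2) / m1
```

The relative difference between the two estimates is small because the time average pools many nearly independent envelope samples.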
Generative networks as inverse problems with Scattering transforms
Generative Adversarial Nets (GANs) and Variational Auto-Encoders (VAEs)
provide impressive image generation from Gaussian white noise, but the
underlying mathematics is not well understood. We compute deep convolutional
network generators by inverting a fixed embedding operator; they therefore do
not need to be optimized with a discriminator or an encoder. The embedding
is Lipschitz continuous to deformations so that generators transform linear
interpolations between input white noise vectors into deformations between
output images. This embedding is computed with a wavelet Scattering transform.
Numerical experiments demonstrate that the resulting Scattering generators have
properties similar to those of GANs or VAEs, without learning a discriminative
network or an encoder.
Comment: International Conference on Learning Representations, 201
Rigid-Motion Scattering for Texture Classification
A rigid-motion scattering computes adaptive invariants along translations and
rotations, with a deep convolutional network. Convolutions are calculated on
the rigid-motion group, with wavelets defined on the translation and rotation
variables. It preserves joint rotation and translation information, while
providing global invariants at any desired scale. Texture classification is
studied, through the characterization of stationary processes from a single
realization. State-of-the-art results are obtained on multiple texture
databases with significant rotation and scaling variability.
Comment: 19 pages, submitted to International Journal of Computer Vision
Phase retrieval for the Cauchy wavelet transform
We consider the phase retrieval problem in which one tries to reconstruct a
function from the modulus of its wavelet transform. We study the uniqueness and
stability of the reconstruction. In the case where the wavelets are Cauchy
wavelets, we prove that the modulus of the wavelet transform uniquely
determines the function up to a global phase. We show that the reconstruction
operator is continuous but not uniformly continuous. We describe how to
construct pairs of functions which are far apart in L^2 norm but whose wavelet
transforms are very close in modulus. The principle is to modulate the wavelet
transform of a fixed initial function by a phase which varies slowly in both
time and frequency. This construction seems to cover all the instabilities that
we observe in practice; we give a partial formal justification to this fact.
Finally, we describe an exact reconstruction algorithm and use it to
numerically confirm our analysis of the stability question.
Comment: Acknowledgments update
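A basic reconstruction-from-modulus loop of the kind the abstract alludes to can be sketched with gradient descent on the squared modulus mismatch. This is a hedged toy version: the Gaussian band-pass filters stand in for Cauchy wavelets, and the step size and iteration count are illustrative, not the paper's exact algorithm.

```python
import numpy as np

n = 128
freqs = np.fft.fftfreq(n)
# Crude Gaussian band-pass filters in the Fourier domain, standing in
# for Cauchy wavelets (illustrative choices only).
filt_hats = [np.exp(-(np.abs(freqs) - c) ** 2 / (2 * 0.02 ** 2)) for c in (0.1, 0.2, 0.3)]

def wavelet_transform(x):
    xh = np.fft.fft(x)
    return np.array([np.fft.ifft(xh * fh) for fh in filt_hats])

def adjoint(y):
    # Adjoint of the wavelet transform, mapping coefficients back to a real signal.
    return np.real(sum(np.fft.ifft(np.fft.fft(yi) * np.conj(fh))
                       for yi, fh in zip(y, filt_hats)))

rng = np.random.default_rng(1)
x_true = rng.standard_normal(n)
target = np.abs(wavelet_transform(x_true))   # only the modulus is observed

def loss(x):
    return np.sum((np.abs(wavelet_transform(x)) - target) ** 2)

x = rng.standard_normal(n)                   # random initialization
loss0 = loss(x)
for _ in range(300):
    w = wavelet_transform(x)
    mod = np.maximum(np.abs(w), 1e-12)
    # Gradient of the squared modulus mismatch with respect to x.
    x -= 0.3 * adjoint((np.abs(w) - target) * w / mod)
loss1 = loss(x)
```

Only the part of the signal inside the filters' pass-bands can be recovered, and only up to a global phase, which is consistent with the uniqueness statement in the abstract.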
Classification with Scattering Operators
A scattering vector is a local descriptor including multiscale and
multi-direction co-occurrence information. It is computed with a cascade of
wavelet decompositions and complex modulus. This scattering representation is
locally translation invariant and linearizes deformations. A supervised
classification algorithm is computed with a PCA model selection on scattering
vectors. State-of-the-art results are obtained for handwritten digit
recognition and texture classification.
Comment: 6 pages. CVPR 201
Deep Learning by Scattering
We introduce general scattering transforms as mathematical models of deep
neural networks with l2 pooling. Scattering networks iteratively apply complex
valued unitary operators, and the pooling is performed by a complex modulus. An
expected scattering defines a contractive representation of a high-dimensional
probability distribution, which preserves its mean-square norm. We show that
unsupervised learning can be cast as an optimization of the space contraction
to preserve the volume occupied by unlabeled examples, at each layer of the
network. Supervised learning and classification are performed with an averaged
scattering, which provides scattering estimates for multiple classes.
Comment: 10 pages, 1 figure
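The contractiveness claimed for these networks follows from composing unitary operators with the modulus: by the reverse triangle inequality, pooling by complex modulus cannot expand distances. A minimal numerical check, using the orthonormal FFT as a stand-in unitary operator:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(128)
y = rng.standard_normal(128)

U = lambda v: np.fft.fft(v, norm="ortho")   # a unitary operator (orthonormal FFT)

# Pooling by complex modulus after a unitary operator is contractive:
# || |Ux| - |Uy| || <= ||Ux - Uy|| = ||x - y||.
lhs = np.linalg.norm(np.abs(U(x)) - np.abs(U(y)))
rhs = np.linalg.norm(x - y)
```

Iterating such stages therefore yields a representation that only contracts the space, which is the property the abstract builds on.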
Geometric Models with Co-occurrence Groups
A geometric model of sparse signal representations is introduced for classes
of signals. It is computed by optimizing co-occurrence groups with a maximum
likelihood estimate calculated with a Bernoulli mixture model. Applications to
face image compression and MNIST digit classification illustrate the
applicability of this model.
Comment: 6 pages, ESANN 201
Deep Scattering Spectrum
A scattering transform defines a locally translation invariant representation
which is stable to time-warping deformations. It extends MFCC representations
by computing modulation spectrum coefficients of multiple orders, through
cascades of wavelet convolutions and modulus operators. Second-order scattering
coefficients characterize transient phenomena such as attacks and amplitude
modulation. A frequency transposition invariant representation is obtained by
applying a scattering transform along log-frequency. State-of-the-art
classification results are obtained for musical genre and phone classification
on the GTZAN and TIMIT databases, respectively.
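The transposition invariance obtained by scattering along log-frequency can be demonstrated on a toy spectrum: a pitch transposition is a shift along the log-frequency axis, and circular wavelet convolution followed by modulus and global averaging is invariant to such shifts. The filters and spectrum below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

n = 64
k = np.fft.fftfreq(n)
# Band-pass filters along the log-frequency axis (illustrative choices).
filt_hats = [np.exp(-(np.abs(k) - c) ** 2 / (2 * 0.05 ** 2)) for c in (0.1, 0.25, 0.4)]

def freq_scattering(log_spectrum):
    # Circular wavelet convolution along log-frequency, modulus, global average.
    sh = np.fft.fft(log_spectrum)
    return np.array([np.abs(np.fft.ifft(sh * fh)).mean() for fh in filt_hats])

rng = np.random.default_rng(2)
spec = rng.random(n)            # toy log-frequency spectrum of a sound
transposed = np.roll(spec, 7)   # pitch transposition = shift along log-frequency

a = freq_scattering(spec)
b = freq_scattering(transposed)
```

Because the circular convolution commutes with the shift and the average discards position, the two representations coincide.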